Biological Imaging
Cambridge University Press (CUP)
Preprints posted in the last 30 days, ranked by how well they match Biological Imaging's content profile, based on 15 papers previously published here. The average preprint has a 0.01% match score for this journal, so anything above that is already an above-average fit.
Hogendorn, C.; R. Aragon, I.; Dallon, S.; Batchelor, E.
To properly respond to their environment, cells adjust the activity of key regulatory proteins and rates of gene expression. Methods to detect and quantify these forms of regulatory dynamics in living cells are of central importance for understanding cellular signaling events in both physiological and pathological conditions. Current technologies in this field make use of fluorescent probes to track cell signaling dynamics. Although these technologies have been used for decades, challenges remain. In particular, the segmentation, tracking, and interpretation of single-cell dynamic data are time-consuming, prone to subjective errors, and often lacking in standardization across experiments. Here, we present SPIFEE, a data pipeline that uses experiment-dependent parameters to smooth noise and quantify key features of fluorescence data from time-lapse imaging studies. Processing data in this manner enhances and accelerates quantification of live-cell gene and protein expression, simplifies data analysis, and facilitates hypothesis generation. Author Summary: Cells adjust protein activity and gene expression levels over time to respond to changes in their environment, a process referred to as cell signaling dynamics. Quantifying cell signaling dynamics in living cells often uses fluorescent probes, such as green fluorescent protein (GFP) and its spectral variants, to track changes in gene expression or protein activity over time. Challenges inherent in analyzing fluorescence data from single cells stem from biological and experimental noise, time-consuming quantification, and subjective errors. To address these challenges, we developed a computational tool called Signal Processing and Integrated Feature Extraction (SPIFEE). The pipeline improves the quality of fluorescence data analysis by reducing noise and extracting signal features in a way that is both intuitive and objective.
The pipeline provides more accurate, rapid, and unbiased quantification of time-lapse microscopy data.
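As a rough illustration of the kind of processing described above (noise smoothing followed by feature extraction from a single-cell fluorescence trace), here is a minimal Python sketch. It is not SPIFEE's implementation: the moving-average smoother, the `window` parameter, and the function names `smooth_trace` and `extract_features` are assumptions chosen only for illustration.

```python
import numpy as np


def smooth_trace(trace, window=5):
    """Centered moving average; edges are padded by reflection.

    `window` is a hypothetical experiment-dependent parameter (odd integer).
    """
    trace = np.asarray(trace, dtype=float)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    padded = np.pad(trace, half, mode="reflect")
    kernel = np.ones(window) / window
    # 'valid' convolution on the padded trace returns the original length.
    return np.convolve(padded, kernel, mode="valid")


def extract_features(times, trace, window=5):
    """Return a few simple features of the smoothed trace."""
    smoothed = smooth_trace(trace, window)
    peak_idx = int(np.argmax(smoothed))
    baseline = float(smoothed[0])  # assumption: first frame is the baseline
    return {
        "baseline": baseline,
        "peak_amplitude": float(smoothed[peak_idx] - baseline),
        "time_to_peak": float(times[peak_idx]),
    }
```

In practice the smoothing window would be tuned to the frame interval and the expected timescale of the signal, so that real pulses are not averaged away.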
Le, T. X.; Tran, L.-A. T.; Farabi, D. A.; Wang, S.; Phan, A. T. Q.; Cormier, S. A.; Taada, A.; McGrew, D.; Du, Y.; Vu, L. D.
Automated analysis of murine bronchoalveolar lavage fluid (BALF) cytology is important for preclinical respiratory research, yet progress has been limited by the lack of publicly available, well-annotated mouse BALF image datasets. We present MurineCyto-Det, a high-resolution murine BALF cytology dataset comprising 333 image tiles of size 1024x1024 pixels, annotated across five cytological categories with both pixel-level segmentation masks and one-to-one matched bounding boxes. The dataset contains 14,551 annotated cell instances and supports two complementary analysis tasks: morphology-oriented cell segmentation and object-level cell detection. To establish reproducible benchmark baselines, we evaluated representative segmentation and detection models. The results demonstrate the practical utility of MurineCyto-Det while highlighting realistic challenges arising from class imbalance, small object size, irregular cell morphology, and ambiguous debris-like structures. MurineCyto-Det provides a standardized resource for developing, evaluating, and comparing automated methods for murine BALF cytology analysis. The dataset is publicly available at https://doi.org/10.5281/zenodo.17608677.
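The one-to-one pairing of segmentation masks and bounding boxes mentioned above can be illustrated with a short Python sketch. It assumes an instance-label image in which 0 is background and each cell has a unique positive integer; that encoding, and the function name `masks_to_boxes`, are assumptions for illustration and may not match how MurineCyto-Det stores its annotations.

```python
import numpy as np


def masks_to_boxes(label_image):
    """Map each instance id to (x_min, y_min, x_max, y_max), inclusive pixels.

    Assumes 0 marks background and each cell carries a unique positive id.
    """
    boxes = {}
    for instance_id in np.unique(label_image):
        if instance_id == 0:
            continue
        ys, xs = np.nonzero(label_image == instance_id)
        boxes[int(instance_id)] = (
            int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())
        )
    return boxes
```

Deriving boxes from masks in this way guarantees that every segmented cell has exactly one matching detection target.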
O'Roberts, E.; Panshikar, P. R.; Li-Wang, X.; Avenel, C.; Verron, Q.; Coulier, E.; Bienko, M.; Stadler, C.
Different omics types such as genomics and proteomics all contribute to deciphering biology. Applying these omics approaches in a spatial context helps reveal biology in situ at a single cell level. Here we present a protocol for the combined multiplexed detection of targeted genes using DNA FISH, and proteins using multiplexed immunofluorescence. The protocol is integrated on the commercial PhenoCycler platform and generates one single dataset with gene and protein readout at a single cell level in large tissue sections, allowing for a throughput of thousands to millions of cells. The workflow can be used for characterising malignant cells in large tumor areas based on genetic aberrations, while deciphering the cellular landscape and microenvironment from multiplexed protein detection using immunofluorescence.
Wilsenach, J. B.; Fonseca, S.; Ahnert, S. E.; Wojtowicz, E. E.
Background: Imaging flow cytometry (IFC) provides a high quantity of single-cell morphological data, yet the field lacks open-access tools for designing interpretable, bespoke parameters. Rare and atypical cell populations, for which well-annotated data are limited, are particularly affected. Results: We present Flow cytometry Feature Importance (FlowFI), an open-source graphical software for bespoke image parameter design and analysis. FlowFI provides a suite of image parameter options combining data across multiple channels and markers, tailored digital noise reduction (reducing noise resulting from common flow cytometry ultra-high image acquisition modalities), and a scalable, unsupervised feature selection pipeline that allows experimentalists to refine image-derived parameters iteratively, with a novel ensemble subsampling approach that provides robust feature importance scoring. We validated FlowFI using data from a rare and heterogeneous bone marrow cell type, megakaryocytes, demonstrating that the tool can successfully identify novel, discriminatory morphological features to improve the purity of selected cell populations and gating strategy. Conclusion: FlowFI's core functionalities are accessed through an intuitive user interface, with options to export data directly to common image and flow cytometry software formats. FlowFI offers a scalable approach to both feature design and feature refinement using a range of manifold learning approaches, augmented by a data-efficient bootstrap subsampling approach for unsupervised parameter recommendations in the big data regime. The software also introduces new feature selection measures based on common manifold learning methods, inspired by the Uniform Manifold Approximation and Projection (UMAP) algorithm, and achieves performance comparable to existing methods.
FlowFI provides a versatile testing ground for future developments in broad and dynamically developing areas of research including single cell analysis, label-free sorting and intra- and inter-cellular interaction analysis, while ensuring interoperability with current research workflows. Desktop installation options as well as detailed documentation can be found at https://github.com/EarlhamInst/FlowFI
Ganz, M.; Norgaard, M.; Pernet, C.; Matheson, G. J.; Galassi, A.; Ceballos, E. G.; Wighton, P.; Bilgel, M.; Eierud, C.; Gonzalez-Escamilla, G.; Buckholtz, J.; Blair, R.; Markiewicz, C. J.; Hardcastle, N.; Greve, D. N.; Thomas, A. G.; Poldrack, R. A.; Calhoun, V. D.; Innis, R. B.; Knudsen, G. M.
Molecular neuroimaging with positron emission tomography (PET) and single-photon emission computed tomography (SPECT) enables quantification of specific molecular targets in the living brain. Despite its scientific impact, molecular neuroimaging research has historically faced challenges due to high costs, small sample sizes, laboratory-specific analysis pipelines, and limited large-scale data sharing. These factors have hindered reproducibility and the broader reuse of valuable PET datasets. The OpenNeuroPET initiative was established to address these barriers by developing standards, infrastructure, and open-source tools for organizing, sharing, and analyzing molecular neuroimaging data. Through collaborations across Europe and North America, OpenNeuroPET has supported the PET extension of the Brain Imaging Data Structure (PET-BIDS), providing a standardized framework for PET datasets and metadata. Building on PET-BIDS, tools such as PET2BIDS, ezBIDS, and BIDSCoin facilitate data conversion and curation. In parallel, OpenNeuro now hosts PET-BIDS datasets for open sharing, while complementary platforms such as PublicnEUro enable GDPR-compliant controlled access. Emerging open-source workflows and BIDS applications further support automated, reproducible PET preprocessing and quantitative analysis, promoting harmonized processing across centers. Together, these developments mark an important step toward an open molecular neuroimaging ecosystem in which datasets, software, and workflows can be transparently shared, reused, and scaled for collaborative research.
Keding, L. T.; Liu, R.-Y.; Keding, T. J.; Vazquez, J.; Bockoven, C. G.; Shah, D. M.; Golos, T. G.; Wieben, O.; Stanic, A. K.
Introduction: Healthy and diseased placentae alike often display some degree of pathology. However, quantitative techniques to characterize common pathologies and their relationship to local maternal hemodynamics in healthy primate placentae are currently limited. Methods: Placentae from seven rhesus macaques were imaged by MRI at three time points across mid- to late gestation, to quantify placental blood volume, flow, and perfusion from maternal spiral arteries across pregnancy. Near term, we collected placental cotyledons, digitized hematoxylin/eosin-stained slides, then segmented and annotated sub-tissues and major pathologies (intervillous gaps, fibrin deposition, villous agglutination, inflammatory agglutination, and stromal mineralization) within each cotyledon. Individual pathologies were assessed in relation to each other and MRI perfusion metrics, in a cotyledon-specific manner. Parallel analyses were performed to investigate both basic (Spearman correlation) and animal variance-negated (dimensionality-reduction) relationships. Results: Cotyledons with increased stromal mineralization demonstrated low blood perfusion across pregnancy, alongside significant compensatory changes. Mineralization was further associated with decreased fetal weight, across all sub-tissues. Dimensionality reduction revealed maternal vascular malperfusion-associated pathologies as the largest contributor to dataset variance. Additionally, pathologies commonly associated with healthy placental function demonstrated low cotyledon blood flow and volume at all time points, with no evidence of compensatory changes across gestation. Conclusions: Comprehensive digital annotation revealed several relationships connecting pathology and maternal blood perfusion in the healthy primate pregnancy, at the smallest functional unit of the placenta.
This methodological framework embeds pathologist-refined morphological expertise into a quantitative, spatially resolved format that can ground, rather than be replaced by, unsupervised computational approaches to placental analysis.
Staller, S. A.; Valentine, V.; Burden, S.
Summary: Sequential multiplexed fluorescence in situ hybridization (FISH) enables spatially resolved molecular profiling in cell monolayers, but analyzing puncta colocalization across three-dimensional (3D) datasets remains a labor-intensive bottleneck. zFISHer is an open-source application built on the napari viewer that provides complete automation of sequential FISH image processing in conjunction with interactive user-curation tools. zFISHer provides end-to-end analysis of paired FISH datasets, encompassing nuclear segmentation, automated puncta detection on unaligned z-stacks, multi-round image registration via translation-constrained RANSAC with optional B-spline deformable warping, precise transformation of puncta coordinates into aligned space, consensus nuclei generation, interactive editing with real-time collision detection, and pairwise and tri-channel colocalization analysis with statistics. This includes a "Fishing Hook" raycasting algorithm that enables users to locate puncta at their true 3D centroids by identifying intensity maxima along the camera ray, eliminating manual z-slice navigation, complemented by a sub-voxel volume optimization. The included batch processing mode enables high-throughput unattended analysis of multiple experimental datasets. Availability and Implementation: zFISHer is open source under the MIT license and freely available on GitHub: https://github.com/stjude/zFISHer. The example dataset (deconvolved ND2 image stacks) is archived on Zenodo at https://doi.org/10.5281/zenodo.20288536. zFISHer is developed in Python and uses the napari viewer for its interface. Documentation and expected test outputs for the sample dataset are available in the same GitHub repository. To report a problem with zFISHer or to contribute to it, please file an issue in the GitHub repository: https://github.com/stjude/zFISHer/issues. Contact: Seth.Staller@STJUDE.ORG. Supplementary Information: Supplementary data are available online.
Dong, Y.; Yang, Z.; Schneider, M.; Scherzer, O.; Schuetz, G.
We introduce a workflow to identify oligomeric structures that are recorded with single-molecule localization microscopy (SMLM) under cryogenic conditions. Typically, these oligomers are assumed to consist of protomers arranged as equilateral two-dimensional polygons and every protomer is labeled with a dye molecule for visualization. Unlike previous work, we consider scenarios in which the sample plane has an unknown orientation relative to the focal plane. Our contribution is a high-precision plane-fitting algorithm to determine the sample plane, combined with geometrical transformations and two circle-fitting algorithms to identify the oligomeric structures. Our simulations on synthetic data demonstrate that the proposed workflow achieves high accuracy in estimating both the unknown tilted plane and the oligomer size.
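The general idea of fitting a tilted sample plane, transforming localizations into that plane, and then fitting a circle can be sketched in Python as follows. This is only an illustration under stated assumptions: the abstract does not specify its plane-fitting or circle-fitting algorithms, so the SVD-based total least-squares plane fit and the algebraic (Kåsa) circle fit used here are stand-ins, and the function names are hypothetical.

```python
import numpy as np


def fit_plane(points):
    """Total least-squares plane through 3D points via SVD.

    Returns the centroid, the unit normal, and a 2x3 orthonormal in-plane basis.
    """
    centroid = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - centroid)
    # Rows of vt are ordered by decreasing singular value: the first two span
    # the plane, the last one is the direction of least variance (the normal).
    return centroid, vt[2], vt[:2]


def project_to_plane(points, centroid, basis):
    """Express 3D points as 2D coordinates in the fitted plane."""
    return (points - centroid) @ basis.T


def fit_circle(xy):
    """Algebraic (Kasa) circle fit: x^2 + y^2 = 2*a*x + 2*b*y + c."""
    design = np.column_stack([2 * xy[:, 0], 2 * xy[:, 1], np.ones(len(xy))])
    rhs = (xy ** 2).sum(axis=1)
    (a, b, c), *_ = np.linalg.lstsq(design, rhs, rcond=None)
    radius = np.sqrt(c + a * a + b * b)  # since c = r^2 - a^2 - b^2
    return np.array([a, b]), radius
```

For an oligomer whose protomers sit on the vertices of a regular polygon, the fitted radius gives the circumradius, from which the side length follows for a known number of protomers.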
Ali, M.; Hutchings, J.; Dutta, T.; Jean, N.; Greenan, G.; Montabana, E. A.; Schwartz, J.; Finn, M. G.; Haury, M.; Agard, D.; Carragher, B.; Kopylov, M.; Paraan, M.
Standardized biological specimens are essential for optimizing cryoEM workflows and benchmarking instrument performance. While apoferritin fulfills this role for single-particle analysis, no equivalent exists for cryo-electron tomography. Ribosomes are frequently used but require large datasets due to C1 symmetry and structural heterogeneity, limiting rapid optimization and standardized comparison of workflows. Here, we present PP7 virus-like particles (VLPs) overexpressed in E. coli as a scalable in situ benchmark. VLPs have high orders of symmetry enabling rapid, high-resolution validation of tomographic pipelines from minimal datasets, while their distinct structural features across low to high resolutions provide a practical resolution metric.
Das, A.; Ahammer, H.; Prabhu, J. S.; Bhat, R.; Jolly, M. K.
Quantitative biophysical signatures of nuclear spatial reorganisation across breast carcinoma progression remain insufficiently characterised. We apply two complementary fractal descriptors, Correlation dimension (Dc) and Minkowski dimension (Dm), to 4276 regions of interest across seven breast tissue subtypes from the BRACS dataset, validating observed dimensions against systematically constructed null spatial models to distinguish genuine structural organisation from geometric irregularity. All subtypes significantly exceed the complete spatial randomness baseline, confirming universal departure from random nuclear arrangement. The observed scaling is characterised as statistically monofractal within a bounded pre-fractal range. Invasive carcinoma uniquely fails to exceed the clustered null in Dc while simultaneously showing the weakest Dm null deviation, a dual convergence toward stochastic baselines consistent with the progressive removal of architectural constraints. Flat epithelial atypia exhibits a unique directional dissociation with the lowest Dc across all subtypes combined with high Dm null deviation, a co-occurrence not observed in any other subtype and geometrically consistent with decoupled nuclear spatial organisation at the centroid distribution and boundary morphology scales. Interpreted within a percolation-theoretic framework, the non-monotonic null deviation trajectory maps onto qualitative regime transitions, providing a physically grounded explanation for the observed discrimination profile across pathological transitions. These findings position fractal-like nuclear architecture as a potential descriptor for pre-malignant transitional states.
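For readers unfamiliar with the correlation dimension, the following Python sketch estimates it for a set of 2D nuclear centroids using the standard Grassberger-Procaccia correlation sum. The choice of radii, the log-log linear fit, and the function name `correlation_dimension` are assumptions for illustration; the abstract does not describe the authors' estimation procedure in this detail.

```python
import numpy as np


def correlation_dimension(points, radii):
    """Slope of log C(r) versus log r, where C(r) is the fraction of
    point pairs closer than r (Grassberger-Procaccia correlation sum)."""
    points = np.asarray(points, dtype=float)
    radii = np.asarray(radii, dtype=float)
    diffs = points[:, None, :] - points[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=-1))
    # Keep each unordered pair once (strict upper triangle).
    pair_dists = dists[np.triu_indices(len(points), k=1)]
    corr_sum = np.array([(pair_dists < r).mean() for r in radii])
    if np.any(corr_sum == 0):
        raise ValueError("smallest radius contains no point pairs")
    slope, _intercept = np.polyfit(np.log(radii), np.log(corr_sum), 1)
    return float(slope)
```

Comparing the value obtained for observed centroids with values from simulated null patterns (for example, uniformly random points in the same region) follows the spirit of the null-model validation described above.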
Heine, J.; Fowler, E.; Eschrich, S. A.; Schell, M.
Data modeling in biomedical research often operates in the small-sample regime, where the number of observations is small relative to the data dimensionality; the detrimental effects of limited sample sizes are well documented in cancer studies. Synthetic data offers a potential solution to data shortfalls provided that the data generated is an adequate facsimile of the underlying distribution; the adequacy of such synthetic data remains an open-ended problem. In this work, we evaluate a synthetic generator proposed previously. The generator applies a series of transformations to the observed data to accommodate the small-sample size resulting in an uncoupled representation, where uncorrelated marginal distributions are modeled with optimized univariate kernel density estimation. In this report, (1) we develop a nonparametric method for assessing multivariate similarity based on the Cramer-Wold theorem and random projection testing, (2) investigate when the absence of bivariate correlation approximates independence in a non-normal setting, and (3) evaluate artifacts induced by data compression. The presentation is primarily methodological; low-dimensional data were used so each stage of the generation process could be analyzed explicitly. A formal testing framework was developed by comparing random projection level outcomes with a two-sample test, modeling these outcomes as Bernoulli trials, aggregating replicate outcomes within each projection direction, and pooling outcomes across many directions, yielding a scalable standardized normal test-statistic. The key innovation was decoupling the two-sample test significance level from that governing finalized normal inference. We showed the same projection framework also evaluates the full multivariate covariance structure. 
The generator produced high-fidelity multivariate synthetic data when the bivariate correlation approximates independence in the non-normal setting; in highly compressed data, residual modes were best modeled as normally distributed regardless of their intrinsic distributional form. Ongoing work includes applying these methods to higher-dimensional, diverse data.
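The random-projection idea behind the Cramér-Wold theorem (two multivariate distributions are equal if and only if all of their one-dimensional projections are equal) can be sketched in Python as below. This is a simplified illustration, not the authors' framework: it uses a two-sample Kolmogorov-Smirnov statistic with its asymptotic critical value as the univariate test, records one reject/accept outcome per direction, and pools them into a z-score under the simplifying assumption that directions are independent. The replicate aggregation described above is omitted, and all function names are hypothetical.

```python
import numpy as np


def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (max gap between ECDFs)."""
    a, b = np.sort(a), np.sort(b)
    grid = np.concatenate([a, b])
    cdf_a = np.searchsorted(a, grid, side="right") / len(a)
    cdf_b = np.searchsorted(b, grid, side="right") / len(b)
    return float(np.abs(cdf_a - cdf_b).max())


def random_projection_test(x, y, n_directions=200, alpha=0.05, seed=0):
    """Project both samples onto random unit vectors and count KS rejections.

    Returns the rejection fraction and a pooled z-score that compares it
    with the nominal level alpha (treating directions as independent
    Bernoulli trials, which is an approximation).
    """
    rng = np.random.default_rng(seed)
    n, m = len(x), len(y)
    # Asymptotic KS critical value at level alpha.
    critical = np.sqrt(-0.5 * np.log(alpha / 2)) * np.sqrt((n + m) / (n * m))
    directions = rng.normal(size=(n_directions, x.shape[1]))
    directions /= np.linalg.norm(directions, axis=1, keepdims=True)
    rejections = sum(ks_statistic(x @ u, y @ u) > critical for u in directions)
    fraction = float(rejections) / n_directions
    z_score = (fraction - alpha) / np.sqrt(alpha * (1 - alpha) / n_directions)
    return fraction, float(z_score)
```

A rejection fraction close to `alpha` is consistent with the synthetic and observed samples sharing a distribution, while a large positive z-score indicates that many projections differ.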
Jung, K. J.; Qiu, J.; Cho, S.; McDonough, E.; Chadwick, C.; Ghose, S.; West, R. B.; Brooks, J. D.; Ginty, F.; Machiraju, R.; Mallick, P.
Accurate prognostic assessment of prostate cancer (PCa) requires an integrated understanding of tissue morphology (encompassing cell structure, glandular architecture, and tissue organization) and the immune environment. We present Prostate-TriMod, a novel tri-modal histology dataset designed to integrate high-resolution visual morphology with spatial tissue maps, immune infiltration patterns, and clinical outcomes. This dataset, generated from the Cell DIVE multiplexed imaging platform, consists of three synchronized modalities: (1) multiscale virtual H&E tiles (224px, 256px, 512px, and 2040px) providing visual morphological context, (2) spatial tissue maps identifying cancerous/non-cancerous epithelial cells, stroma and immune cell populations (via TOPAZ and CAT models), and (3) text captions generated from single-cell data and patterns. The dataset includes comprehensive clinical annotations, including Grade Groups and biochemical recurrence (BCR) status. By providing high-fidelity alignment between visual features, spatial tissue maps, and textual descriptions, Prostate-TriMod empowers the development of advanced multimodal AI frameworks. We expect this resource to support reuse in multimodal representation learning, spatial analysis, and benchmarking studies that link histology morphology and immune context to clinical outcomes in prostate cancer.
Putta, S.; Jensen, W.; Devakonda, S.; Pennell, L.; Croteau, J.
High-dimensional single-cell technologies, such as flow cytometry and CITE-Seq, typically rely on established lineage markers to define cell identities. Additional markers are commonly analyzed within the context of these predefined cell types. Nonlinear projection methods such as t-SNE and UMAP provide a visual framework for this analysis by enabling the overlay of cell types and marker expression. However, these methods frequently produce projections where distinct cell types substantially overlap, hindering interpretation of marker expression patterns relative to known cell types. In this study, we investigate the underlying causes of this phenomenon and demonstrate that such overlaps often stem from the inherent high-dimensional structure of the data rather than limitations in the dimensionality reduction algorithms themselves. To address this, we introduce Cell Type Weighted Dimensionality Reduction (CWDR), a novel approach that incorporates lineage-based information through a supervised weighting mechanism. By integrating both cell identity and marker expression, CWDR preserves the visual separation between predefined cell types while maintaining the local variance necessary for downstream analysis. We validate our method across multiple high-dimensional flow cytometry and proteogenomic datasets. Our results show that CWDR significantly reduces inter-cluster overlap compared to traditional methods, providing a clearer framework for visualizing marker expression within the context of specific cell lineages.
Wong, A. Y. H.; Lu, Y. D.; Zhao, Z.; Zhou, F.; Park, H.; Maliga, Z.; Anang, Y.; Coy, S.; Danuser, G.; Santagata, S.; Yapp, C.; Sorger, P. K.
The tissue-resident immune system involves complex 3D assemblies that interact with extended structures such as blood vessels and nerves. These interactions are difficult to study using conventional 2D profiling because they span many tissue sections. In animal tissues, volumetric imaging approaches such as light-sheet fluorescence microscopy (LSFM) are widely used to study 3D tissue organization, with labelling often aided by genetically encoded reporters and vascular dyes. In contrast, LSFM of human specimens remains underdeveloped because most clinical samples are available only as formalin-fixed paraffin-embedded (FFPE) tissue, limiting labeling strategies primarily to dyes and antibodies. Here, we present a volumetric cyclic immunofluorescence (v-CyCIF) and virtual H&E toolbox that overcomes key barriers to multiplexed imaging of immune cells and nerves in human specimens up to 1 mm thick. We use v-CyCIF to study neuroimmune interactions in normal and cancer tissues and to immunoprofile intact secondary and tertiary lymphoid structures. Re-embedding and sectioning of specimens following volumetric imaging enables high-plex high-resolution analysis of subcellular structures and cell-cell interactions associated with immune cell activity. v-CyCIF therefore provides a flexible framework for multi-scale 3D profiling of clinical specimens across imaging formats and resolutions.
von Zuben de Valega Negrao, C.; Hendrick, H.; Ammar, F.; V. Klotz, R.; Dias, S.; Yu, M.
Metastasis remains the major cause of cancer-related mortality, and circulating tumor cells (CTCs) are both candidate liquid-biopsy biomarkers and plausible intermediates of metastatic dissemination. Because CTCs are extremely rare in peripheral blood, platform comparisons have often focused solely on recovery. That focus is insufficient for applications that depend on the quality of the recovered material, including single-cell profiling, short-term culture, and functional testing. Here, we compared four CTC isolation approaches: TellDx CTC System, Genesis System, RosetteSep, and flow cytometry, using spike-in experiments in human blood. Capture efficiency was evaluated across all four platforms; purity was assessed for TellDx, Genesis, and RosetteSep; and post-isolation GFP signal persistence in culture was assessed for TellDx and Genesis as an exploratory proxy for short-term post-isolation preservation. Under the conditions tested, TellDx showed the highest recovery (88.1% ± 3.7%), followed by Genesis (40.6% ± 12.1%), RosetteSep (36.5% ± 9.0%), and flow cytometry (7.6% ± 4.5%). TellDx also showed the highest purity score (3.76), whereas Genesis (2.25) and RosetteSep (2.09) did not differ substantially. In the short-term culture assay, TellDx-derived samples retained a higher normalized GFP signal than Genesis-derived samples at 48 h and 72 h. To synthesize these readouts, we propose the Recovery Performance Index (RPI), a composite score integrating recovery, purity, and post-isolation signal persistence. Within this experimental framework, TellDx achieved the highest RPI. These data support two conclusions. First, platform benchmarking for CTC workflows benefits from multidimensional evaluation rather than recovery alone. Second, under this spike-in model and within the specific workflows used here, TellDx performed best among the platforms tested.
The principal contribution of this study is therefore the establishment of a practical benchmarking framework that can be expanded in future work using clinical samples, multiple CTC phenotypes, and orthogonal viability assays.
Hellingman, A.; Gumpp, C.; Möhrle, J. J.; Tornesi, B.; Leroy, D.; Wittlin, S.; Maeser, P.; Brancucci, N. M. B.; Wicha, S.; Rottmann, M.
Malaria remains a major global health challenge, with emerging partial resistance to first-line therapies in Africa threatening current control efforts. Drug combinations are essential to improve treatment efficacy and restrain resistance development. However, in vitro assays that quantify parasite viability after drug exposure and characterize pharmacodynamic drug interactions are labor- and resource-intensive, with standard approaches such as the parasite reduction ratio assay limiting systematic, high-resolution evaluation of drug combinations. We present the MUltidimensional Luminescence Test for integration of interactions (MULT-i2), an in vitro assay that enables scalable, high-resolution assessment of parasite viability across multidimensional drug concentration spaces. For dual drug combinations, the MULT-i2 assay characterizes interaction surfaces while requiring approximately 50-fold fewer resources and more than two-fold less time than conventional methods, enabling exploration of broader combination scenarios. The assay combines a highly sensitive chemiluminescence readout with inducible reporter expression in Plasmodium falciparum, supporting potential extension to multidimensional combination testing. Using the general pharmacodynamic interaction (GPDI) model, the MULT-i2 assay quantified interaction potency and directionality, confirming and refining the known synergy between atovaquone and proguanil, and revealing detailed interaction patterns for additional drug combinations. Overall, this approach provides an efficient framework for testing and characterizing pharmacodynamic drug interactions and supports the rational development of antimalarial combination therapies.
Alchaar, M.; Dogan, B.
Dimensionality reduction for visualization is a fundamental step in single-cell RNA sequencing (scRNA-seq) analysis due to the extremely high dimensionality of gene expression profiles. However, widely used nonlinear embedding techniques such as UMAP and t-SNE can introduce substantial distortions when projecting data into two-dimensional space, potentially altering global organization, local neighborhoods, and distance relationships in ways that may mislead downstream biological interpretation. In this study, we investigate the applicability of Clustering-Based Manifold Approximation and Projection (CBMAP) for the visualization of scRNA-seq data and systematically examine how clustering strategies influence the quality of the resulting embeddings. CBMAP was integrated with several clustering algorithms commonly used in single-cell analysis, including k-means, Leiden, HDBSCAN, Secuer, HGC, and FlowSOM. The resulting embeddings were evaluated using quantitative metrics that measure global, local, and distance-level structure preservation and were compared with widely used dimensionality reduction methods such as UMAP, t-SNE, and PaCMAP across multiple benchmark datasets. Our results demonstrate that the clustering stage plays a critical role in determining the structural fidelity of CBMAP embeddings. Clustering algorithms specifically designed for single-cell transcriptomic data, particularly Secuer, produced more consistent preservation of global relationships between cell populations. Across multiple datasets, CBMAP more faithfully preserved global structural organization and inter-population distance relationships than the compared methods, although local neighborhood preservation was generally weaker than in techniques optimized for local structure. Importantly, CBMAP embeddings retained biologically meaningful relationships in trajectory benchmark datasets. 
When combined with RNA velocity analysis, CBMAP successfully preserved cyclic progenitor states and branching differentiation trajectories, demonstrating compatibility with trajectory-aware visualization. These findings indicate that CBMAP provides a structure-faithful visualization framework for scRNA-seq data and that clustering selection plays a central role in determining embedding quality.
Xia, T.; Islam, S. M. S.; Xie, Z.; Zhao, X.; Zhi, D.
Unsupervised deep-learning image phenotypes (UDIPs) derived from brain MRI are propelling imaging genetics to link brain structure to genetic variation. However, their replicability across datasets has not been sufficiently evaluated, raising questions about whether they capture robust biological structure or reflect training-specific artifacts. Here, we assess the replicability of UDIPs under variation in model initialization, data partitioning, and cohort, directly evaluating their stability across experimental conditions. We trained multiple models under (i) different training batch random seeds, (ii) cross-validation splits, and (iii) independent datasets (UKB and ADNI), across CNN and ViT architectures. We then derived representations from a separate UKB discovery cohort (N = 22,985) for both trained models and randomly initialized models without training. Representation stability was assessed using centered kernel alignment (CKA; mean ViT 0.74 vs random 0.27) and kernel canonical correlation analysis (KCCA; mean ViT 0.84 vs random 0.60), and genetic discovery stability was assessed using the loci overlap ratio (mean ViT 0.45 vs random 0.08). We further applied weighted MAXVAR generalized CCA to 12 embeddings to extract a shared 30-dimensional subspace. Our results show that UDIPs exhibit statistically significant stability (CKA, KCCA t test p < 0.001) across training perturbations and preserve biologically meaningful structure (loci overlap ratio t test p < 0.001) across cohorts, supporting their use in imaging genetics.
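Centered kernel alignment, one of the stability measures named above, has a compact closed form when a linear kernel is used. The Python sketch below shows that linear variant; the abstract does not state which kernel the authors used, so treating it as linear is an assumption, and `linear_cka` is a hypothetical function name.

```python
import numpy as np


def linear_cka(x, y):
    """Linear centered kernel alignment between two representation matrices.

    x has shape (n_samples, d1) and y has shape (n_samples, d2); rows must
    correspond to the same samples. The result lies in [0, 1].
    """
    x = x - x.mean(axis=0)
    y = y - y.mean(axis=0)
    cross = np.linalg.norm(y.T @ x, "fro") ** 2
    norm_x = np.linalg.norm(x.T @ x, "fro")
    norm_y = np.linalg.norm(y.T @ y, "fro")
    return float(cross / (norm_x * norm_y))
```

Linear CKA is invariant to orthogonal rotations and isotropic rescaling of either representation, which is why it suits comparisons between embeddings from models trained with different seeds.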
Dai, W.
Spatial transcriptomics (ST) enables transcriptome profiling with preserved spatial context, providing spatial dimensions that are essential for understanding complex intercellular signals in tissue architecture. ST-based cell-cell communication (CCC) tools integrate spatial and molecular information to decipher intercellular interactions from a spatially informed perspective. Despite the rapid evolution of many CCC computational tools, a systematic assessment of how well they handle ST-specific heterogeneity, use spatial information efficiently, and remain robust to technical or biological noise is still lacking. To address this gap, SpatialCCCbench incorporates classification accuracy, spatial signal features, robustness, and user-friendliness, aiming to guide the selection of optimal CCC inference tools across diverse spatial biology contexts. SpatialCCCbench systematically evaluates the scenario-specific applicability of ST-based CCC tools. It helps users select tools according to their analytical objectives and provides a practical benchmark for future method development. Highlights: Established a multi-dimensional benchmark suite to evaluate CCC inference methods in spatial transcriptomics. Characterized the spatial patterns of CCC signals across diverse tissues using spatial autocorrelation and local diversity analysis. Systematically assessed the robustness of CCC inference tools across six common experimental noise scenarios in spatial transcriptomics. Integrated boundary-feature analysis, a mechanistically important component for biological interpretation, to uncover spatial preferences and algorithmic biases in CCC methods. Provided guidelines to assist in the selection of optimal CCC inference tools tailored to various spatial biology contexts.
[Graphical abstract: Figure 1]
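Spatial autocorrelation of a per-spot signal is commonly summarized with Moran's I. The Python sketch below computes it for a generic spatial weight matrix. The abstract does not say which autocorrelation statistic SpatialCCCbench uses, so Moran's I is an assumed example, and `morans_i` and `chain_weights` are hypothetical helper names.

```python
import numpy as np


def morans_i(values, weights):
    """Global Moran's I for values observed at n locations.

    `weights` is an (n, n) spatial weight matrix with a zero diagonal;
    weights[i, j] > 0 marks locations i and j as neighbors.
    """
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    deviations = values - values.mean()
    numerator = deviations @ weights @ deviations
    denominator = (deviations ** 2).sum()
    return float(len(values) / weights.sum() * numerator / denominator)


def chain_weights(n):
    """Toy neighbor structure: n spots in a row, each linked to adjacent spots."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w
```

Values near +1 indicate that neighboring spots carry similar signal, values near 0 indicate no spatial structure, and negative values indicate that neighbors tend to differ.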
Kinman, L. F.; Grassetti, A. V.; Carreira, M. V.; Davis, J. H.
The emergence of single-particle cryoEM as a powerful method for structure determination has in large part been fueled by its ability to resolve both single static structures and complex conformational landscapes. Indeed, modern approaches to the heterogeneous reconstruction task can resolve hundreds to thousands of different maps from a single cryoEM dataset. How accurate these algorithms are, however, has proven difficult to rigorously assess, due to a lack of suitable benchmark datasets containing both realistic noise features and ground-truth labels. To address this obstacle, we recently developed a series of benchmark datasets that leverage the targeting power of Cas9 and the programmable heterogeneity of DNA to provide ground-truth per-particle structural labels in real data. Here, we challenged two popular heterogeneous reconstruction algorithms with mixed particle stacks resampled in silico from these datasets, finding that existing approaches resolve the encoded heterogeneity with limited accuracy. In particular, in realistic particle stacks with complex, multi-scale, and multi-axis heterogeneity, we observed that reconstruction of encoded heterogeneity depended strongly on the application of prior information about where heterogeneity was expected, and that individual particle assignments were made with significant error even when the correct structural states were reconstructed. Both molecular breathing motions and data collection features, such as defocus and projection angle, contributed to the observed particle assignment error. These results highlight important shortcomings of existing heterogeneous reconstruction methods and suggest new avenues for method development in both data collection strategies and in heterogeneous classification and reconstruction algorithms.